Lightweight Modeling of Java Virtual Machine Security Constraints using Alloy

Author

  • Mark C. Reynolds
Abstract

The Java programming language has been widely described as secure by design. Nevertheless, a number of serious security vulnerabilities have been discovered in Java, particularly in the component known as the Bytecode Verifier. This paper describes a method for representing Java security constraints using the Alloy modeling language. It further describes a system for performing a security analysis on any block of Java bytecodes by converting the bytes into relation initializers in Alloy. Any counterexamples found by the Alloy Analyzer correspond directly to insecure code. An analysis of a real-world malicious applet is given to demonstrate the efficacy of the approach.

Introduction

This paper describes an analysis tool for verifying security constraints within Java bytecodes. The investigation was motivated by the continued appearance of malicious Java code that violates the security constraints imposed by the Java compiler, the Java Bytecode Verifier and the Java runtime. The analysis approach is based on the lightweight modeling language Alloy [AL, DJ].

The paper first describes the security verification approach taken by the Java Virtual Machine (JVM) and briefly enumerates some of the ways in which it has been circumvented. A review of the top-level goals of this work is then presented, followed by a hierarchical description of the design of the analysis tool and its implementation. Results are then presented in detail using a real-world example of malicious code: the BlackBox applet. Finally, a path toward future work on the analysis tool is described. The tool has proven to be a powerful means of analyzing JVM security constraints, and applying lightweight modeling to check JVM security constraints appears to be a new approach.

Background

The Java programming language has been touted as “secure by design” since its inception.
However, attacks against Java security have appeared since the earliest days of Java. Felten discovered several weaknesses in the Java security model almost immediately, and his work on Java [FE] contains an extensive list of early exploits. The development of Java malware has continued unabated up to the present. The Common Vulnerabilities and Exposures project [CV] lists numerous Java bugs that can lead to privilege escalation, exfiltration of sensitive data, denial of service and other malicious outcomes.

Of particular note is the BlackBox malicious Java applet [BB, LS]. This applet exploits a number of Java security weaknesses and was widely deployed, infecting thousands of machines. The BlackBox applet not only breaks out of the supposedly invulnerable sandbox that the Java applet runtime imposes, it also escalates its privilege to the highest possible level. (Since BlackBox is specific to the Windows operating system, this is the SYSTEM privilege, equivalent to root on a Unix/Linux machine.) The applet can easily be customized to download any program to the infected machine and then run it. It is thus not only an exploit in itself; it is also a delivery vehicle for an arbitrary malicious payload. The BlackBox applet will be analyzed using the methodology presented in this paper.

In order to understand how these security failures come about, and to understand the motivation for developing the analysis tool described in this paper, it is first necessary to review the Java security model briefly. Java security is enforced in three ways. The first is the Java compiler, which enforces a large number of rules, both to ensure that the syntax and semantics of the Java language are satisfied and to prohibit certain actions that are known to be associated with malicious code. For example, the Java compiler will refuse to compile any program that contains a method that makes use of an uninitialized variable.
The output of the Java compiler is a binary file known as a classfile. In order for a Java application or applet to use the methods provided by a class, it must load the classfile that contains that class into the process or thread that uses those methods. Loading is accomplished by a Java classloader. Every Java classloader implicitly invokes the second enforcement mechanism, the Java Bytecode Verifier. The Bytecode Verifier checks that the contents of the classfile conform to the classfile format. More importantly, it also verifies a large number of security constraints before it will allow the classloader to succeed. The third and final part of Java security enforcement is handled by the Java runtime, which performs array bounds checking, runtime type conversion checking and a number of other tests.

Almost all Java exploits to date have used weaknesses in the Bytecode Verifier. The Bytecode Verifier is a part of the JVM, and the rules that it checks when analyzing a classfile are described in great detail in the JVM specification [JV]. The Bytecode Verifier uses a constraint-based approach in performing its analysis. For example, it checks that all local variables are written before being read, that each instruction receives precisely the set of operands that it expects, that the stack has the same depth at each program point regardless of the execution path used to reach that program point, and many other constraints.

Our approach uses Alloy to perform constraint analysis on Java bytecodes. It attempts to emulate the constraint checking that is ostensibly being performed by the Bytecode Verifier. In Alloy it is very easy to express constraints as formulas involving relations, and it has therefore proven to be a rich environment for checking Java security constraints.
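
Two of the constraints named above (locals written before being read, and equal stack depth on every path to a program point) can be illustrated with a small worklist check. This is a minimal sketch in Python, not the paper's Alloy model and not the real Bytecode Verifier: the seven-instruction set, the `check_method` function and the use of instruction indices in place of byte offsets are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical mini instruction set: mnemonic -> (values popped, values pushed).
# The names echo real JVM opcodes, but this table is illustrative only.
STACK_EFFECT = {
    "iconst": (0, 1),   # push an int constant
    "iload": (0, 1),    # push local variable `arg`
    "istore": (1, 0),   # pop into local variable `arg`
    "iadd": (2, 1),     # pop two ints, push their sum
    "ifeq": (1, 0),     # pop; branch to index `arg` if zero
    "goto": (0, 0),     # jump to index `arg`
    "ireturn": (1, 0),  # pop and return (terminal)
}


def successors(code, pc):
    """Instruction indices that may execute directly after `pc`."""
    op, arg = code[pc]
    if op == "ireturn":
        return []
    if op == "goto":
        return [arg]
    if op == "ifeq":
        return [pc + 1, arg]
    return [pc + 1]


def check_method(code):
    """Return a list of constraint violations found in `code`."""
    violations = []

    def report(msg):
        if msg not in violations:  # a pc can be revisited; report once
            violations.append(msg)

    depth_at = {0: 0}                # stack depth on entry to each pc
    written_at = {0: frozenset()}    # locals written on every path to pc
    work = deque([0])
    while work:
        pc = work.popleft()
        op, arg = code[pc]
        depth, written = depth_at[pc], written_at[pc]
        pops, pushes = STACK_EFFECT[op]
        if depth < pops:
            report(f"pc {pc}: stack underflow at {op}")
            continue
        if op == "iload" and arg not in written:
            report(f"pc {pc}: local {arg} read before write")
        if op == "istore":
            written = written | {arg}
        new_depth = depth - pops + pushes
        for nxt in successors(code, pc):
            if nxt >= len(code):
                report(f"pc {pc}: execution falls off the end of the code")
            elif nxt not in depth_at:
                depth_at[nxt], written_at[nxt] = new_depth, written
                work.append(nxt)
            else:
                if depth_at[nxt] != new_depth:
                    report(f"pc {nxt}: stack depth {new_depth} != "
                           f"{depth_at[nxt]} on merge")
                merged = written_at[nxt] & written
                if merged != written_at[nxt]:  # fewer locals guaranteed
                    written_at[nxt] = merged
                    work.append(nxt)
    return violations


# A well-formed method, and one whose two paths reach index 4 with
# different stack depths.
GOOD = [("iconst", 1), ("istore", 0), ("iload", 0), ("ireturn", None)]
BAD_DEPTH = [("iconst", 0), ("ifeq", 4), ("iconst", 1), ("goto", 4),
             ("iconst", 7), ("ireturn", None)]
```

Calling `check_method(GOOD)` yields an empty list, while `check_method(BAD_DEPTH)` reports a depth mismatch at index 4. In the paper's approach the same properties are instead stated as Alloy assertions and checked by the Alloy Analyzer.
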
Some previous efforts have been made to apply formal methods to Java bytecodes [XU], but these efforts have used a more heavyweight model checking approach that attempts to prove soundness. Alloy’s lightweight, constraint-based approach instead converts assertions into Boolean formulas and then searches for satisfying assignments or for counterexamples.

Goals

The work described in this paper has three goals: (1) to provide an extensible framework for modeling the security constraints imposed by the JVM’s Bytecode Verifier; (2) to provide a concrete model for as many security constraints as possible; and (3) to demonstrate that the analysis tool checks them correctly.

It would be straightforward to use Alloy to create a model for one specific block of Java bytecode. While this might demonstrate the applicability of Alloy to security analysis of the JVM, it would have little value in analyzing compliance with the JVM security constraints as a whole. It is therefore desirable to have an extensible model. In this context “extensible” means that the model can be applied to any block of JVM code and can analyze that code against a specified set of constraints. The Design section shows how this goal was realized.

Several of the security constraints imposed by the JVM have already been mentioned. As indicated above, the JVM checks a substantial number of such constraints. Most constraints are independent of one another, although there are some functional overlaps, as will be demonstrated below. In order to establish the soundness of the basic concept, it was deemed prudent to select a realistic subset of the full set of JVM security constraints and to begin with a simple model that would encompass that subset.
In developing the model for this initial set of constraints, with the extensibility goal in mind, a general framework for code analysis was created such that adding further constraints would involve only incremental modifications, and not a complete restructuring of the model code. The current implementation concretely models a small set of security constraints. While certain technical challenges, such as providing a complete model of exception handling, have not yet been addressed, the work to date strongly suggests that the current implementation can be readily adapted to additional constraints. A survey of next steps is given in the Future Work section of this paper.

Design

Alloy is a lightweight modeling language that uses first-order logic. In Alloy the concept of a “relation” is central. Alloy is capable of analyzing assertions for satisfiability and also for the existence of counterexamples. A key observation is that the security constraints imposed by the JVM can be modeled as invariants, and thus can be analyzed by the Alloy Analyzer. Alloy is not a proof system, so the failure to find a counterexample to a constraint is not a proof that the constraint is satisfied, only that it is satisfied within the search space specified. If a counterexample is found, however, that does indicate that the invariant has been violated, and the Alloy Analyzer conveniently provides a graphical representation of that counterexample.

In light of the extensibility goal described in the previous section, the initial design problem was to find an implementation of the Alloy model that would capture the invariants of interest abstractly, independent of any actual JVM code, but would then permit the model to be run against any concrete realization of such JVM code.
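
The difference between "no counterexample within the search space" and "proved" can be made concrete with a bounded search. The sketch below is an assumption-laden illustration in Python: it enumerates candidate programs by brute force, whereas Alloy translates assertions into Boolean formulas, and the `MAX_STACK` limit and the two-instruction language are hypothetical.

```python
from itertools import product

MAX_STACK = 3  # hypothetical limit on operand stack depth


def run(program):
    """Return the stack depth after each instruction, or None if the
    program pops from an empty stack (such programs are not considered)."""
    depth = 0
    trace = []
    for op in program:
        depth += 1 if op == "push" else -1
        if depth < 0:
            return None
        trace.append(depth)
    return trace


def find_counterexample(scope):
    """Search every program of at most `scope` instructions for one that
    violates the invariant 'depth never exceeds MAX_STACK'."""
    for length in range(scope + 1):
        for program in product(("push", "pop"), repeat=length):
            trace = run(program)
            if trace is not None and any(d > MAX_STACK for d in trace):
                return program
    return None  # no counterexample in this scope; not a proof
```

With a scope of 3 the search returns `None`, which only says the invariant holds for programs of up to three instructions. Raising the scope to 4 returns four consecutive pushes as a counterexample, so the choice of scope determines which violations can be found at all.
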
Initial experimentation with Alloy suggested two possible approaches: automatically generate Alloy functions, facts or predicates based on the JVM code to be analyzed, or automatically generate Alloy statements that initialize relations based on the JVM code to be analyzed. In order to realize a classical code/data separation, the latter approach was chosen. The Alloy model is thus realized as a template containing a fixed set of relations, functions, facts, predicates and assertions. This template is then supplemented by relation initializers derived from particular JVM code.

In this approach, the template portion of the Alloy model is completely independent of any choice of Java bytecodes, while the initializers depend only weakly on the detailed implementation of the template. Specifically, the generated initializers depend only on the set of relations being initialized, and not on any specific way in which the constraints are realized in the model template. This decoupling between the “data” portion of the model and the “code” portion of the model is the means by which the stated extensibility goal has been achieved.

Further requirements analysis revealed that these two top-level components, the model template and the initializers, could be refined into four components: (1) the relation definitions; (2) the relation initializers; (3) the execution engine; and (4) the constraint assertions. The relation definitions, execution engine and constraint assertions are all part of the Alloy model template.

The relation definitions are the Alloy definitions of the top-level signatures (sets of atoms that carry relations), together with the definitions of the relations themselves. As will be seen in the Implementation section, these relation definitions capture the static properties of individual JVM instructions, as well as the JVM state as the execution engine executes.
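
The template/initializer split can be sketched as a fixed text template plus a generated fragment. The Python below is a hypothetical illustration only: the relation names `offset` and `len`, the atom names `I0`, `I1`, and the functions `initializers` and `build_model` are assumptions, not the relations or generator used in the paper.

```python
# Fixed part of the model: identical for every analyzed method.
# The field names `offset` and `len` are hypothetical.
TEMPLATE = """\
// Model template (fixed)
abstract sig Instruction { offset: Int, len: Int }
"""


def initializers(instructions):
    """Emit an Alloy fragment with one atom per decoded instruction.

    `instructions` is a list of (byte offset, mnemonic, length in bytes)
    tuples, assumed to have been decoded from a method already.
    """
    lines = []
    for index, (offset, mnemonic, length) in enumerate(instructions):
        atom = f"I{index}"
        lines.append(f"// {mnemonic}")
        lines.append(f"one sig {atom} extends Instruction {{}}")
        lines.append(
            f"fact {{ {atom}.offset = {offset} and {atom}.len = {length} }}"
        )
    return "\n".join(lines) + "\n"


def build_model(instructions):
    """Combine the fixed template with method-specific initializers."""
    return TEMPLATE + "\n" + initializers(instructions)


# Two different methods share the same template text.
METHOD_A = [(0, "iconst_0", 1), (1, "ireturn", 1)]
METHOD_B = [(0, "bipush", 2), (2, "istore_1", 1), (3, "return", 1)]
```

Because only `initializers` looks at the bytecode, adding a new assertion to the template would not require any change to the generator, which is the decoupling the text describes.
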
All other components of the Alloy model are logically dependent on the relation definitions. The relation initializers are the initial values of the Alloy relations. They are generated from specific JVM code, and vary from one invocation of the model to the next. An initial design decision was made to capture JVM code at the method level. This is a trade-off between performance and granularity. It is certainly possible to model multiple methods within a single model. However, the time that Alloy takes to analyze a particular model depends strongly on the number of program execution states, which in turn depends strongly on the size of the relation initializers. As will be seen below, the Alloy model template is well suited to analyzing code blocks within a method, and could be extended to handle multiple methods.

Relation initializers need to be generated from specific Java methods, so there must be an automatic way of converting the Java bytecodes in a method into these relation initializers. To this end, a Java classfile parser was created to perform the conversion. The parser takes a Java classfile as input and produces an Alloy model fragment as output. When the model fragment is combined with the Alloy template, a complete Alloy model is produced, as shown in Figure 1.

Figure 1: Constructing a complete Alloy model using the classfile parser

The relation definitions and their initializers form a static representation of a set of properties of the Java method being analyzed. In order to observe dynamic behavior, this static representation needed to be extended with model actions that mimic the execution of the JVM itself, at least to the extent that the JVM’s Bytecode Verifier synthetically executes method code in order to perform its own constraint checking. Thus, an execution engine was needed. This execution engine would represent the flow of execution through the medium of stateful relations.
Alloy’s “ordering” utility is used for representing this state. Execution could not be unbounded, of course, since Alloy only performs analysis over a finite set of states. It would have been possible simply to let Alloy “fall off the end” of execution, that is, to allow the analyzer to perform an exhaustive analysis of all possible states in the state space. For both performance and structural reasons this was deemed unacceptable. The execution engine was therefore designed such that certain JVM instructions are designated as terminal instructions. (Any type of return instruction would be terminal, for example.) The execution engine recognizes this condition and acts on it in such a way as to create no further unique states. This models the actual execution of the JVM itself: certain instructions within a method are, in fact, terminal, in that they cause the method to be exited.

One obvious question is how iterative constructs are handled by the execution engine. Would it provide better model fidelity to have the execution engine attempt to mimic runtime execution exactly, or would this lead to unacceptable performance penalties? In fact, the execution engine does not attempt to perform any branch prediction analysis in the model. The precise way in which this was handled, and its implications, are explained in the Implementation section below.

Finally, the model must provide a way for each JVM security constraint to be checked by Alloy. Formulating the security constraints as Alloy assertions proved to be straightforward once the model had been constructed to accurately reflect the static and dynamic properties of the method code.

Implementation

The implementation of the JVM security constraints analyzer will be described in three subsections.
In the first subsection, the three components of the model template, namely the relation definitions, the execution engine, and the security constraint assertions, will be described. In the second subsection, the implementation of the Class2Alloy classfile parser, which is used to generate the relation initializers, will be discussed. In the third subsection a concrete example will be dissected, including a description of the parser invocation and the subsequent model analysis. The example in question is a reduced form of the BlackBox applet.

Model Template

The model template employs two top-level signatures, an “Instruction” signature and a “State” signature. The Instruction signature is made abstract so that each of the individual instructions that make up a method can be defined as a concrete, atomic extension of this abstract signature. Intuitively, this is reasonable because the properties (relations) of instructions vary from instruction to instruction, but are still static for any particular instruction. For example, the length of a given instruction in bytes is fixed once the instruction is specified, but obviously varies between instructions. The “State” signature is derived from Alloy’s ordering utility, which predefines certain relations such as “first”, “next” and “last”. The State signature is dynamic, and the values of its relations are updated by the execution engine as it executes during analysis. The Alloy definition of these two signatures is shown below.

abstract sig Instruction { map: Int,
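
The interplay between an ordered sequence of states and terminal instructions can be sketched as follows. This is a simplified Python stand-in, not the paper's Alloy execution engine: the `trace` function is hypothetical, a plain list plays the role of the ordered State atoms, and only straight-line code is handled because the excerpt does not say how branches are treated.

```python
# JVM return opcodes; any of them ends execution of the current method.
TERMINAL = {"return", "ireturn", "lreturn", "freturn", "dreturn", "areturn"}


def trace(code, scope):
    """Return the instruction index held by each of `scope` ordered states.

    `code` is a list of mnemonics for straight-line code. Once a terminal
    instruction is reached, every later state repeats it, so no further
    unique states are created.
    """
    states = [0]  # the first state points at the first instruction
    for _ in range(scope - 1):
        pc = states[-1]
        if code[pc] in TERMINAL:
            states.append(pc)          # stutter on the terminal instruction
        elif pc + 1 < len(code):
            states.append(pc + 1)      # advance to the following instruction
        else:
            raise ValueError("execution falls off the end of the method")
    return states


METHOD = ["iconst_0", "istore_1", "iload_1", "ireturn"]
```

For `METHOD` with a scope of six states, the last three states all point at the `ireturn` instruction, so the number of distinct states is bounded by the method length rather than by the scope.
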


Similar resources

Lightweight Modeling of Java Virtual Machine Security Constraints

The Java programming language has been widely described as secure by design. Nevertheless, a number of serious security vulnerabilities have been discovered in Java, particularly in the component known as the Bytecode Verifier. This paper describes a method for representing Java security constraints using the Alloy modeling language. It further describes a system for performing a security analy...


Modeling the Java Bytecode Verifier

The Java programming language has been widely described as secure by design. Nevertheless, a number of serious security vulnerabilities have been discovered in Java, particularly in the Bytecode Verifier, a critical component used to verify class semantics before loading is complete. This paper describes a method for representing Java security constraints using the Alloy modeling language. It f...


Comparing Java and .NET security: Lessons learned and missed

Many systems execute untrusted programs in virtual machines (VMs) to mediate their access to system resources. Sun introduced the Java VM in 1995, primarily intended as a lightweight platform for executing untrusted code inside web pages. More recently, Microsoft developed the .NET platform with similar goals. Both platforms share many design and implementation properties, but there are key dif...


Magic Potion : A Metalanguage for Incorporating

if your preferred environment requires only a few features from another paradigm, you must typically adopt the whole alien platform to take advantage of them. The alternative of using other languages and tools to implement the features in a way that avoids adding the whole platform is generally at least as difficult. But a more affordable solution is often possible. We used metaprogramming to i...


VAlloy - Virtual Functions Meet a Relational Language

We propose VAlloy, an extension to the first order, relational language Alloy. Alloy is suitable for modeling structural properties of object-oriented software. However, Alloy lacks support for dynamic dispatch, i.e., function invocation based on actual parameter types. VAlloy introduces virtual functions in Alloy, which enables intuitive modeling of inheritance. Models in VAlloy are automatica...




Publication date: 2008